NSF PAR Search | NSF Public Access Repository

Graph-constrained analysis for multivariate functional data

https://doi.org/10.1016/j.jmva.2025.105428

Dey, Debangan; Banerjee, Sudipto; Lindquist, Martin A; Datta, Abhirup (May 2025, Journal of Multivariate Analysis)

The manuscript considers multivariate functional data analysis with a known graphical model among the functional variables representing their conditional relationships (e.g., brain region-level fMRI data with a prespecified connectivity graph among brain regions). Functional Gaussian graphical models (GGM) used for analyzing multivariate functional data customarily estimate an unknown graphical model, and cannot preserve knowledge of a given graph. We propose a method for multivariate functional analysis that exactly conforms to a given inter-variable graph. We first show the equivalence between partially separable functional GGM and graphical Gaussian processes (GP), proposed recently for constructing optimal multivariate covariance functions that retain a given graphical model. The theoretical connection helps to design a new algorithm that leverages Dempster’s covariance selection for obtaining the maximum likelihood estimate of the covariance function for multivariate functional data under graphical constraints. We also show that the finite term truncation of functional GGM basis expansion used in practice is equivalent to a low-rank graphical GP, which is known to oversmooth marginal distributions. To remedy this, we extend our algorithm to better preserve marginal distributions while respecting the graph and retaining computational scalability. The benefits of the proposed algorithms are illustrated using empirical experiments and a neuroimaging application.
Free, publicly-accessible full text available May 1, 2026
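
As a small illustration of the covariance selection idea mentioned in the abstract above, the sketch below handles the simplest decomposable case: three variables on a chain graph 0 - 1 - 2 with no edge between 0 and 2. Under that assumed graph, the estimate keeps the sample covariances on the diagonal and on the edges, and fills the non-edge entry so that the corresponding precision entry is zero. The function names and the sample matrix are hypothetical, and this is not the authors' algorithm for functional data.

```python
def chain_covariance_selection(S):
    """Covariance-selection estimate for a 3x3 sample covariance S under
    the chain graph 0 - 1 - 2 (assumed graph; the edge 0-2 is absent)."""
    sigma = [row[:] for row in S]
    # Diagonal and edge entries are kept from S. The non-edge entry is
    # filled so that the (0, 2) entry of the inverse becomes zero.
    fill = S[0][1] * S[1][2] / S[1][1]
    sigma[0][2] = sigma[2][0] = fill
    return sigma


def inverse_3x3(M):
    """Inverse of a 3x3 matrix via the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = M
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [
        [e * i - f * h, c * h - b * i, b * f - c * e],
        [f * g - d * i, a * i - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]
    return [[x / det for x in row] for row in adj]


# Hypothetical sample covariance, for illustration only.
S = [[2.0, 0.8, 0.5], [0.8, 1.5, 0.6], [0.5, 0.6, 1.0]]
sigma_hat = chain_covariance_selection(S)
precision = inverse_3x3(sigma_hat)  # its (0, 2) entry is numerically zero
```

For graphs that are not decomposable, no closed form of this kind is available and an iterative scheme would be needed instead.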
Bayesian Multi-Group Gaussian Process Models for Heterogeneous Group-Structured Data

Li, Didong; Jones, Andrew; Banerjee, Sudipto; Engelhardt, Barbara E (January 2025, Journal of Machine Learning Research)

Gaussian processes are pervasive in functional data analysis, machine learning, and spatial statistics for modeling complex dependencies. Scientific data are often heterogeneous in their inputs and contain multiple known discrete groups of samples; thus, it is desirable to leverage the similarity among groups while accounting for heterogeneity across groups. We propose multi-group Gaussian processes (MGGPs) defined over R^p × C, where C is a finite set representing the group label, by developing general classes of valid (positive definite) covariance functions on such domains. MGGPs are able to accurately recover relationships between the groups and efficiently share strength across samples from all groups during inference, while capturing distinct group-specific behaviors in the conditional posterior distributions. We demonstrate inference in MGGPs through simulation experiments, and we apply our proposed MGGP regression framework to gene expression data to illustrate the behavior and enhanced inferential capabilities of multi-group Gaussian processes by jointly modeling continuous and categorical variables.
Full Text Available
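
To make the notion of a valid covariance function on a domain that pairs a continuous input with a group label more concrete, here is a minimal sketch of one simple construction: a separable kernel that multiplies a squared-exponential kernel on the continuous input by a between-group correlation matrix. This is a swapped-in textbook construction (positive semidefinite because it is a product of two positive semidefinite kernels), not necessarily one of the covariance classes developed in the paper. All names, the correlation matrix, and the lengthscale are hypothetical.

```python
import math


def multi_group_kernel(x1, c1, x2, c2, group_corr, lengthscale=1.0):
    """Separable covariance between (x1, c1) and (x2, c2): a squared-exponential
    kernel on the continuous inputs times a group correlation entry."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    spatial = math.exp(-sq_dist / (2.0 * lengthscale ** 2))
    return group_corr[c1][c2] * spatial


def gram_matrix(points, group_corr, lengthscale=1.0):
    """Covariance matrix for a list of (x, group_label) pairs."""
    return [
        [multi_group_kernel(xi, ci, xj, cj, group_corr, lengthscale)
         for (xj, cj) in points]
        for (xi, ci) in points
    ]


def cholesky(A):
    """Lower Cholesky factor; raises ValueError if A is not positive definite."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L


# Hypothetical setup: two groups with correlation 0.6, inputs in R^2.
group_corr = [[1.0, 0.6], [0.6, 1.0]]
points = [((0.0, 0.0), 0), ((1.0, 0.0), 0), ((0.0, 0.0), 1), ((0.5, 1.0), 1)]
K = gram_matrix(points, group_corr)
L = cholesky(K)  # succeeds, so K is positive definite for these points
```

The off-diagonal entry of `group_corr` controls how much strength is shared across groups in this sketch: a value near 1 treats the groups as nearly one process, and 0 makes them independent.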
Bayesian Modeling with Spatial Curvature Processes

https://doi.org/10.1080/01621459.2023.2177166

Halder, Aritra; Banerjee, Sudipto; Dey, Dipak K (April 2024, Journal of the American Statistical Association)

Spatial process models are widely used for modeling point-referenced variables arising from diverse scientific domains. Analyzing the resulting random surface provides deeper insights into the nature of latent dependence within the studied response. We develop Bayesian modeling and inference for rapid changes on the response surface to assess directional curvature along a given trajectory. Such trajectories or curves of rapid change, often referred to as wombling boundaries, occur in geographic space in the form of rivers in a flood plain, roads, mountains or plateaus or other topographic features leading to high gradients on the response surface. We demonstrate fully model-based Bayesian inference on directional curvature processes to analyze differential behavior in responses along wombling boundaries. We illustrate our methodology with a number of simulated experiments followed by multiple applications featuring the Boston Housing data; Meuse river data; and temperature data from the Northeastern United States. Supplementary materials for this article are available online.
Full Text Available
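
For readers unfamiliar with the term, the directional curvature of a smooth surface f at a location s along a unit direction u is the second directional derivative u'Hu, where H is the Hessian of f at s. The sketch below approximates it for a known deterministic surface with a central second difference. It only illustrates the quantity being studied and is not the Bayesian inference on curvature processes described in the abstract above. The function name, the step size, and the test surface are assumptions.

```python
import math


def directional_curvature(f, s, u, h=1e-3):
    """Central finite-difference approximation to the second directional
    derivative of f at location s along direction u (normalised internally)."""
    norm = math.sqrt(sum(c * c for c in u))
    u = [c / norm for c in u]
    forward = f([si + h * ui for si, ui in zip(s, u)])
    backward = f([si - h * ui for si, ui in zip(s, u)])
    return (forward - 2.0 * f(list(s)) + backward) / (h * h)


def surface(p):
    """Hypothetical response surface f(x, y) = x^2 + 3 y^2."""
    x, y = p
    return x * x + 3.0 * y * y


# Curvature along the x-axis and along the y-axis at the point (1, 2).
curv_x = directional_curvature(surface, (1.0, 2.0), (1.0, 0.0))
curv_y = directional_curvature(surface, (1.0, 2.0), (0.0, 1.0))
```

For this quadratic surface the Hessian is constant with diagonal entries 2 and 6, so the two values should be close to 2 and 6 up to floating-point error.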
Fixed-Domain Asymptotics Under Vecchia's Approximation of Spatial Process Likelihoods

https://doi.org/10.5705/ss.202021.0428

Zhang, Lu; Tang, Wenpin; Banerjee, Sudipto (January 2024, Statistica Sinica)

Full Text Available
Modeling Multivariate Spatial Dependencies Using Graphical Models

https://doi.org/10.51387/23-NEJSDS47

Dey, Debangan; Datta, Abhirup; Banerjee, Sudipto (September 2023, The New England Journal of Statistics in Data Science)

Graphical models have witnessed significant growth and usage in spatial data science for modeling data referenced over a massive number of spatial-temporal coordinates. Much of this literature has focused on a single or relatively few spatially dependent outcomes. Recent attention has focused upon addressing modeling and inference for a substantially large number of outcomes. While spatial factor models and multivariate basis expansions occupy a prominent place in this domain, this article elucidates a recent approach, graphical Gaussian Processes, that exploits the notion of conditional independence among a very large number of spatial processes to build scalable graphical models for fully model-based Bayesian analysis of multivariate spatial data.
Full Text Available
Bayesian hierarchical modeling and analysis for actigraph data from wearable devices

https://doi.org/10.1214/23-AOAS1742

Alaimo Di Loro, Pierfrancesco; Mingione, Marco; Lipsitt, Jonah; Batteate, Christina M; Jerrett, Michael; Banerjee, Sudipto (December 2023, The Annals of Applied Statistics)

The majority of Americans fail to achieve recommended levels of physical activity, which leads to numerous preventable health problems, such as diabetes, hypertension, and heart diseases. This has generated substantial interest in monitoring human activity to gear interventions toward environmental features that may relate to higher physical activity. Wearable devices, such as wrist-worn sensors that monitor gross motor activity (actigraph units), continuously record the activity levels of a subject, producing massive amounts of high-resolution measurements. Analyzing actigraph data needs to account for spatial and temporal information on trajectories or paths traversed by subjects wearing such devices. Inferential objectives include estimating a subject’s physical activity levels along a given trajectory, identifying trajectories that are more likely to produce higher levels of physical activity for a given subject, and predicting expected levels of physical activity in any proposed new trajectory for a given set of health attributes. Here, we devise a Bayesian hierarchical modeling framework for spatial-temporal actigraphy data to deliver fully model-based inference on trajectories while accounting for subject-level health attributes and spatial-temporal dependencies. We undertake a comprehensive analysis of an original dataset from the Physical Activity through Sustainable Transport Approaches in Los Angeles (PASTA-LA) study to ascertain spatial zones and trajectories exhibiting significantly higher levels of physical activity while accounting for various sources of heterogeneity.
Full Text Available
Inference for Gaussian processes with Matérn covariogram on compact Riemannian manifolds

Li, Didong; Tang, Wenpin; Banerjee, Sudipto (March 2023, Journal of Machine Learning Research)

Full Text Available
Scalable Predictions for Spatial Probit Linear Mixed Models Using Nearest Neighbor Gaussian Processes

https://doi.org/10.6339/22-JDS1073

Saha, Arkajyoti; Datta, Abhirup; Banerjee, Sudipto (November 2022, Journal of Data Science)

Spatial probit generalized linear mixed models (spGLMM) with a linear fixed effect and a spatial random effect, endowed with a Gaussian Process prior, are widely used for analysis of binary spatial data. However, the canonical Bayesian implementation of this hierarchical mixed model can involve protracted Markov Chain Monte Carlo sampling. Alternative approaches have been proposed that circumvent this by directly representing the marginal likelihood from spGLMM in terms of multivariate normal cumulative distribution functions (cdf). We present a direct and fast rendition of this latter approach for predictions from a spatial probit linear mixed model. We show that the covariance matrix of the cdf characterizing the marginal cdf of binary spatial data from spGLMM is amenable to approximation using Nearest Neighbor Gaussian Processes (NNGP). This facilitates a scalable prediction algorithm for spGLMM using NNGP that only involves sparse or small matrix computations and can be deployed in an embarrassingly parallel manner. We demonstrate the accuracy and scalability of the algorithm via numerous simulation experiments and an analysis of species presence-absence data.
Full Text Available
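
The nearest-neighbor idea behind NNGP can be sketched in a few lines: the joint Gaussian density is written as a product of conditionals, and each conditional is restricted to a small set of previously ordered neighbors, so only small matrix computations are needed. The sketch below is a deliberately reduced version with a single neighbor, one-dimensional distinct locations, and an exponential covariance. These choices, the function names, and the parameter values are assumptions for illustration. It does not implement the probit prediction algorithm from the abstract above.

```python
import math


def exp_cov(s1, s2, sigma2=1.0, phi=1.0):
    """Exponential covariance between two one-dimensional locations."""
    return sigma2 * math.exp(-phi * abs(s1 - s2))


def nngp_logdensity_1nn(locs, w, sigma2=1.0, phi=1.0):
    """Nearest-neighbor approximation to the zero-mean Gaussian log density
    of w at distinct locations locs, using one neighbor per location."""
    total = 0.0
    for i in range(len(locs)):
        mean = 0.0
        var = exp_cov(locs[i], locs[i], sigma2, phi)
        if i > 0:
            # Nearest neighbor among the locations that come earlier in the ordering.
            j = min(range(i), key=lambda k: abs(locs[i] - locs[k]))
            c_ij = exp_cov(locs[i], locs[j], sigma2, phi)
            c_jj = exp_cov(locs[j], locs[j], sigma2, phi)
            mean = c_ij / c_jj * w[j]      # conditional mean given the neighbor
            var -= c_ij ** 2 / c_jj        # conditional variance given the neighbor
        total += -0.5 * (math.log(2.0 * math.pi * var) + (w[i] - mean) ** 2 / var)
    return total


# Hypothetical locations and process values.
locs = [0.0, 0.7, 1.5, 2.1]
w = [0.3, -0.2, 0.5, 0.1]
log_density = nngp_logdensity_1nn(locs, w)
```

Each term depends on one location and one neighbor, so the cost grows linearly in the number of locations and the terms can be evaluated independently. With sorted one-dimensional locations and an exponential covariance the process is Markov, so a single neighbor already recovers the exact density. In higher dimensions or with other covariance functions, more neighbors are typically used and the result is an approximation.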
A nearest‐neighbour Gaussian process spatial factor model for censored, multi‐depth geochemical data

https://doi.org/10.1111/rssc.12565

Davies, Tilman M.; Banerjee, Sudipto; Martin, Adam P.; Turnbull, Rose E. (August 2022, Journal of the Royal Statistical Society: Series C (Applied Statistics))

Full Text Available
Highly Scalable Bayesian Geostatistical Modeling via Meshed Gaussian Processes on Partitioned Domains

https://doi.org/10.1080/01621459.2020.1833889

Peruzzi, Michele; Banerjee, Sudipto; Finley, Andrew O. (April 2022, Journal of the American Statistical Association)

Full Text Available
